REMARKS 



This paper is in response to the Office Action dated May 20, 2004 
regarding U.S. Patent Application Serial No. 09/927,041, filed August 9, 2001. 
There are currently 20 claims pending. Claims 10, 17 and 20 stand objected to 
under 37 C.F.R. 1.75(c) as being of improper dependent form. Claims 10, 17 and 
20 have been amended to place the claims in proper independent form. Claims 21 
and 22 have been added. Support for Claims 21 and 22 is found in the originally filed 
claims and in the Specification on pages 7-10. Claims 1-2 and 4-7 stand rejected 
under 35 U.S.C. 102(a) as being anticipated by Qian et al., U.S. Pat. No. 
6,721,454 (hereinafter Qian 454). Claims 3 and 8 stand rejected under 35 U.S.C. 
103(a) as being unpatentable over Qian 454. Claims 9, 11-15 and 16-20 stand 
rejected under 35 U.S.C. 103(a) as being unpatentable over Qian 454 in further 
view of Qian et al., U.S. Pat. No. 6,616,529 (hereinafter Qian 529). Applicant 
respectfully requests reconsideration in view of the foregoing amendments and 
the remarks herein below. 

Rejection of Claims 1-2 and 4-7 under 35 U.S.C. 102(a) under Qian 454 
Qian 454 teaches a method for automatic extraction of semantically 
significant events from a video sequence. Column 2, lines 61-66 of Qian 454 
describe the technique as detecting events by decomposing video into three levels. 
Importantly, the first level is "a video sequence 2 is input to the first level 4 of the 
technique where it is composed into shots." In contradistinction to Qian 454, 
Claim 1 teaches "obtaining unstructured video frames." Column 3, lines 
35-50 further describe the first level of the technique as relying on the input being 
a video sequence in that a video sequence is first defined as including "one or 
more scenes which, in turn, include one or more video shots." The fact that a 
sequence includes "scenes" of video contraindicates "unstructured video frames" 
as required in Claim 1. Further, Column 3, lines 35-50 teach a sequence with 
scenes including video shots so that "the boundaries of the constituent shots of the 
sequence are detected 6." This teaches away from "unstructured" video frames as 
claimed. 

Claim 1 further teaches "extracting a set by processing pairs of segments 
for their visual dissimilarity and temporal relationship, and merging the video 
segments by applying a probabilistic analysis to the extracted set to represent the 
video structure". 

Qian 454 fails to teach "extracting a set by processing pairs of segments 
for their visual dissimilarity and temporal relationship and merging the video 
segments by applying a probabilistic analysis to the extracted set" as required by 
Claim 1. Qian 454, and specifically Column 3, lines 1-8, teaches classifying and 
summarizing shots in a video sequence, but not merging of video segments; and 
not merging by applying probabilistic analysis. Columns 3 and 4 describe how 
shot boundaries can be "forced" or "inserted" into a "sequence," contrary to 
Claim 1 requiring "merging". Moreover, the estimation for each pair of frames in 
a shot in Qian 454 is according to "global motion". Global motion refers to the 
motion estimation determined in Qian 454 by applying a Gaussian technique as is 
known in the art using, inter alia, a normalized dot product calculation between 
frames. In fact, most of the information used in Qian 454 relates to motion 
detection and identifying content in a video sequence to segment a video sequence 
that contains homogeneous content. For example, a hunt event is described in 
Column 11, lines 60-65. See also, e.g., claims 1, 5 and 8 in Qian 454. 

In contradistinction, Claim 1 provides for "merging video segments with a 
merging criterion that applies a probabilistic analysis to the feature set, thereby 
generating a merging sequence representing the video structure" which does not 
relate to summarizing or identifying content in video. In fact, Qian 454 teaches 
away from "merging" by teaching separating portions of a video for purposes of 
identifying an event. Accordingly, Qian 454 fails to teach or suggest the 
limitations of Claim 1, and Claim 1 is allowable. Claims 2-9 depend from Claim 1 
and are allowable for at least this reason. 

Additionally, Claim 4 teaches eliminating "the presence of multiple 
adjacent shot boundaries" which is neither taught nor suggested in Qian 454. 
Rather, Qian 454, and specifically, Column 3, lines 45-50, teaches away from 
eliminating shot boundaries, and, instead, teaches detection of shot boundaries. 

Claim 7 as amended teaches "generating parametric mixture models to 
represent class-conditional densities of inter-segment features of the feature set; 
and applying the merging criterion to the parametric mixture models." Qian 454, 
in contrast, fails to teach inter-segment features as defined by the Applicant. 
Rather, Qian 454 teaches a color histogram technique used for detecting 
boundaries and not inter-segment features as claimed. Moreover, Qian 454 teaches 
applying probabilistic techniques to features in a frame. For example, Qian 454 
states "Fig. 6 illustrates feature space outputs from multiple color and texture 
filters applied to a video frame." Column 10, lines 31-32. 

Rejection of Claims 3 and 8 under 35 U.S.C. 103(a) under Qian 454 
Claim 3 teaches that "the difference signal is based on a mean 
dissimilarity determined over a plurality of frames centered on one of the 
consecutive frames and corresponding in number of frames to a fraction of the 
frame rate of video capture." As stated in the Office Action, Qian 454 fails to 
teach basing the number of frames used to calculate the difference signal on a 
fraction of the frame rate of video capture. A motivation cited by the Office Action 
for using a fraction of the frame rate is stated as a desire to shorten the time frame 
for calculating a difference signal. Contrary to the cited motivation, however, 
using a fraction of the frame rate of video capture would not shorten the time 
frame for calculating a difference signal. Rather, the use of a fraction of the 
frame rate could have the opposite effect by causing a higher number of 
computations due to the reduced window size for calculating a difference signal. 
Therefore, Claim 3 is allowable over Qian 454. 
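
For illustration only, the difference signal recited in Claim 3 may be sketched as a centered moving mean whose length is tied to the frame rate. The sketch below is not taken from the Specification or from Qian 454. The function name, the default fraction of 0.2, and the truncation of the window at the ends of the sequence are assumptions made for the example.

```python
def windowed_mean_dissimilarity(dissimilarity, frame_rate, fraction=0.2):
    """Return a centered moving mean of per-frame dissimilarity values.

    dissimilarity: one value per pair of consecutive frames (assumed input).
    frame_rate:    frames per second of video capture.
    fraction:      assumed fraction of the frame rate used as the window size.
    """
    # Window length in frames, a fraction of the capture frame rate.
    window = max(1, round(frame_rate * fraction))
    # The window is centered on frame i, so it spans up to 2 * half + 1 values.
    half = window // 2
    means = []
    for i in range(len(dissimilarity)):
        lo = max(0, i - half)                       # truncate at the start
        hi = min(len(dissimilarity), i + half + 1)  # truncate at the end
        chunk = dissimilarity[lo:hi]
        means.append(sum(chunk) / len(chunk))
    return means


signal = windowed_mean_dissimilarity([0, 0, 10, 0, 0], frame_rate=15)
```

With a frame rate of 15 and a fraction of 0.2, the window covers three values, so the isolated peak in the example input is spread over the three frames centered on it.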

Claim 8 teaches "initializing the queue by introducing each feature into 
the queue with a priority equal to the probability of merging each corresponding 
pair of segments; depleting the queue by merging the segments if the merging 
criterion is met; and updating the model of the merged segment and then updating 
the queue based upon the updated model." Qian 454 fails to teach merging 
segments as discussed above, and further fails to teach updating a model of a 
merged segment or depleting a queue by merging segments. Although the official 
notice relates to queues for implementing hierarchical displays, Qian 454 in 
combination with the official notice fails to teach what Qian 454 lacks with 
respect to Claim 8. Accordingly, Claim 8 is believed allowable. 
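
For illustration only, the queue handling recited in Claim 8 (initializing the queue, depleting it by merging, and updating the model and the queue) may be sketched as follows. The sketch is not taken from the Specification. The scalar segment model, the `merge_probability` function, the restriction to adjacent segment pairs, and the threshold value are hypothetical stand-ins chosen to keep the example short.

```python
import heapq
import math


def merge_probability(a, b):
    # Hypothetical stand-in for the probabilistic analysis:
    # closer mean feature values give a higher probability of merging.
    return math.exp(-abs(a["mean"] - b["mean"]))


def merge_segments(means, threshold=0.5):
    """Return the merging sequence as (left_id, right_id, merged_id) tuples."""
    # One model per initial segment: a mean feature value and a count.
    models = {i: {"mean": m, "count": 1} for i, m in enumerate(means)}
    left = {i: (i - 1 if i > 0 else None) for i in models}
    right = {i: (i + 1 if i < len(means) - 1 else None) for i in models}
    next_id = len(means)

    # Initialize the queue: each adjacent pair enters with a priority equal
    # to its probability of merging (negated, since heapq is a min-heap).
    heap = []
    for i in range(len(means) - 1):
        p = merge_probability(models[i], models[i + 1])
        heapq.heappush(heap, (-p, i, i + 1))

    sequence = []
    # Deplete the queue, merging while the merging criterion is met.
    while heap:
        neg_p, a, b = heapq.heappop(heap)
        if a not in models or b not in models:
            continue  # stale entry: one of the segments was already merged
        if -neg_p < threshold:
            break  # the best remaining pair fails the criterion
        ma, mb = models.pop(a), models.pop(b)
        count = ma["count"] + mb["count"]
        mean = (ma["mean"] * ma["count"] + mb["mean"] * mb["count"]) / count
        merged = {"mean": mean, "count": count}  # updated model
        new, next_id = next_id, next_id + 1
        models[new] = merged
        la, rb = left.pop(a), right.pop(b)
        left.pop(b)
        right.pop(a)
        left[new], right[new] = la, rb
        # Update the queue based upon the updated model.
        if la is not None:
            right[la] = new
            heapq.heappush(heap, (-merge_probability(models[la], merged), la, new))
        if rb is not None:
            left[rb] = new
            heapq.heappush(heap, (-merge_probability(merged, models[rb]), new, rb))
        sequence.append((a, b, new))
    return sequence


merging_sequence = merge_segments([0.0, 0.5, 5.0, 5.1])
```

Each merged segment receives a fresh identifier, so a queue entry that refers to an already merged segment can be recognized and skipped when it is popped, rather than being searched for and removed at merge time.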

Rejection of Claims 9, 11-15 and 16-20 under 35 U.S.C. 103(a) under 
Qian 454 in combination with Qian 529 

Qian 454 in combination with Qian 529 fails to teach Claims 9, 11-15 and 
16-20. Qian 529 teaches a reality-based sports gaming network using a 
hierarchical event model that uses a probabilistic inferential Bayesian network 
that is trained using semantic events detected from a real version of a sports 
match. See abstract. The Bayesian network taught in Qian 529 is a dynamic 
network that operates on collected data related to sports events to automatically 
extract semantically meaningful events from such events. The hierarchical event 
model taught is a Bayesian network of graphical network nodes (see Column 4, 
lines 62-65) and is used because Bayesian networks support the use of 
probabilistic inference to update and revise belief values for qualitative 
inferences. The Bayesian network is called Bayesian because it performs 
statistical inferences based on Bayes' rule, meaning that decision making is 
according to a probabilistic distribution function to determine whether a child 
node should follow the value of a parent node. See Column 4, lines 32-50. 

Claim 9 teaches representing the merging sequence in a hierarchical tree 
structure. Claims 11-15 and 16-20 teach, inter alia, "structuring video by 
probabilistic merging of video segments, said method comprising: a) obtaining a 
plurality of frames of unstructured video; 

b) generating video segments from the unstructured video by 
detecting shot boundaries based on color dissimilarity between consecutive video 
frames; 

c) extracting a feature set by processing pairs of segments for 
visual dissimilarity and their temporal relationship, thereby generating an 
inter-segment visual dissimilarity feature and an inter-segment temporal 
relationship feature; 

d) generating a parametric mixture model of the inter-segment 
features comprising the feature set; and 

e) merging video segments with a merging criterion that applies a 
probabilistic Bayesian analysis to the parametric mixture model, thereby 
generating a merging sequence representing the video structure." 



Claims 11-15 and 16-20 apply a Bayesian analysis to a parametric 
mixture model of inter-segment features. Qian 
454, as described above, teaches an event inference module that allows 
descriptors to identify shots and enable video database indexing, retrieval and 
browsing. The Qian 454 descriptors include temporal descriptors of objects 
and temporal relations of video. Qian 529 could be combined with Qian 454 to 
provide further event identifications such as the "hunt event" mentioned in Col. 
11, line 60 and further identified in Qian 529, Column 2, lines 34-36. 

Although event identification for purposes of a gaming network or for 
extracting events from video can benefit from a Bayesian network method, neither 
Qian 454 nor Qian 529, either alone or in combination, teaches applying Bayesian 
analysis to a parametric mixture model of inter-segment features. Importantly, 
Claims 11-15 and 16-20 apply the Bayesian analysis to generate a merging 
sequence representing the video structure and not to enable descriptors or identify 
child node values of a network. Similarly, although Qian 529 teaches a 
hierarchical network, Qian 529 fails to teach a hierarchical tree structure for 
merging sequences. The merging sequence is defined in Claim 11 as being 
generated using Bayesian analysis. As one of skill in the art will appreciate, using 
Bayesian analysis to create merging sequences and using a hierarchical tree 
structure as claimed is fundamentally different from creating a hierarchical 
network by applying Bayesian decision theory for parent-child relationships in the 
network nodes. Accordingly, Claims 9 and 11-20 are allowable over Qian 454 
and Qian 529, either alone or in combination. Claims 12-16 depend from Claim 
11 and are allowable for at least this reason. Claims 17, 18 and 20 each teach 
applying Bayesian analysis as described, including limitations included in Claim 
11, "merging video segments with a merging criterion that applies a probabilistic 
Bayesian analysis to the parametric mixture model, thereby generating a merging 
sequence representing the video structure," and are allowable for the same reasons 
as described with regard to Claims 9 and 11. Claim 19 depends from Claim 18 
and is allowable with Claim 18. 

Conclusion 

Claims 1-20 are pending. Claims 21 and 22 have been added. No 
new matter has been added thereby. The objection to Claims 10, 17 and 20 under 
37 C.F.R. 1.75(c) as being of improper dependent form has been addressed by 
amendments to the claims. The rejection of Claims 1-2 and 4-7 under 35 U.S.C. 
102(a) has been traversed. The rejection of Claims 3 and 8 under 35 U.S.C. 
103(a) has been traversed. The rejection of Claims 9, 11-15 and 16-20 under 35 
U.S.C. 103(a) has been traversed. 

It is respectfully submitted, therefore, that in view of the above 
amendments and remarks, this application is now in condition for allowance, 
prompt notice of which is earnestly solicited. 



Respectfully submitted, 